Recurrent Neural Networks With Column-Wise Matrix–Vector Multiplication on FPGAs
Authors
Abstract
This article presents a reconfigurable accelerator for REcurrent Neural networks with fine-grained cOlumn-Wise matrix–vector multiplicatioN (RENOWN). We propose a novel latency-hiding hardware architecture for recurrent neural network (RNN) acceleration that uses column-wise matrix–vector multiplication (MVM) instead of the state-of-the-art row-wise operation. This hardware (HW) design can eliminate data dependencies and thereby improve the throughput of RNN inference systems. In addition, we introduce a configurable checkerboard tiling strategy that accommodates large weight matrices while supporting various configurations of element-based parallelism (EP) and vector-based parallelism (VP). These optimizations increase HW utilization and enhance system throughput. Evaluation results show that our design can achieve over 29.6 tera operations per second (TOPS), which would be among the highest for field-programmable gate array (FPGA)-based designs. Compared with state-of-the-art accelerators on FPGAs, our design achieves 3.7–14.8 times better performance and has higher HW utilization.
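To make the row-wise versus column-wise distinction concrete, here is a minimal NumPy sketch (not the paper's hardware implementation; the sizes are hypothetical). In the row-wise order, an output element is finished only after a full dot product over one row, so consumers must wait on that dependency chain; in the column-wise order, one input element is streamed at a time and every output accumulates a partial sum in parallel, which is what lets the hardware hide latency.

```python
import numpy as np

# Hypothetical sizes for illustration only.
rows, cols = 4, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((rows, cols))
x = rng.standard_normal(cols)

# Row-wise MVM: y[i] is complete only after the full dot product
# over row i, creating a long dependency chain per output.
y_row = np.array([W[i, :] @ x for i in range(rows)])

# Column-wise MVM: stream one input element x[j] at a time and
# update a partial sum for every output simultaneously; partial
# results are available after each column is consumed.
y_col = np.zeros(rows)
for j in range(cols):
    y_col += W[:, j] * x[j]

# Both orderings compute the same product.
assert np.allclose(y_row, y_col)
```

The two loops differ only in iteration order, which is exactly why the choice is free functionally but matters for pipelining in hardware.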
Related works
Efficient Recurrent Neural Networks using Structured Matrices in FPGAs
Recurrent Neural Networks (RNNs) are becoming increasingly important for time-series-related applications, which require efficient and real-time implementations. The recent pruning-based work ESE (Han et al., 2017) suffers from degraded performance/energy efficiency due to the irregular network structure after pruning. We propose block-circulant matrices for weight matrix representation in...
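The key property behind the block-circulant approach can be sketched briefly (a generic illustration, not that paper's FPGA design): a circulant block is defined by a single vector, and its matrix–vector product reduces to a circular convolution that an FFT computes in O(n log n) instead of O(n²).

```python
import numpy as np

def circulant(c):
    """Expand a first column c into the full circulant matrix (for checking)."""
    n = len(c)
    return np.array([np.roll(c, i) for i in range(n)]).T

rng = np.random.default_rng(1)
n = 8                          # block size (hypothetical)
c = rng.standard_normal(n)     # n parameters define the whole n x n block
x = rng.standard_normal(n)

# Dense matvec: O(n^2) work and n^2 stored weights.
y_dense = circulant(c) @ x

# FFT-based matvec: a circulant matvec is a circular convolution
# of the defining vector with the input, so it costs O(n log n).
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

assert np.allclose(y_dense, y_fft)
```

A full weight matrix is partitioned into such blocks, so storage drops from n² to n parameters per block while keeping a regular structure that hardware can exploit, unlike irregular pruning.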
On Serial Multiplication with Neural Networks
In this paper we propose no-learning-based neural networks for serial multiplication. We show that for "subarray-wise" generation of the partial product matrix and a data transmission rate of -bit per cycle, the serial multiplication of two n-bit operands can be computed in n serial cycles with an O(n ) size neural network, and maximum fan-in and weight values both in the order of O( log ). The ...
Recurrent neural networks based Indic word-wise script identification using character-wise training
This paper presents a novel methodology of Indic handwritten script recognition using Recurrent Neural Networks and addresses the problem of script recognition in poor data scenarios, such as when only character level online data is available. It is based on the hypothesis that curves of online character data comprise sufficient information for prediction at the word level. Online character dat...
On Multiplicative Integration with Recurrent Neural Networks
We introduce a general and simple structural design called "Multiplicative Integration" (MI) to improve recurrent neural networks (RNNs). MI changes the way in which information from different sources flows and is integrated in the computational building block of an RNN, while introducing almost no extra parameters. The new structure can be easily embedded into many popular RNN models, includi...
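As a rough sketch of the MI idea (my reading of the snippet, with hypothetical sizes): the vanilla RNN building block adds the input and recurrent information flows, phi(Wx + Uh + b), while MI replaces the sum with an element-wise (Hadamard) product, phi(Wx * Uh + b), so the two sources gate each other without adding parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5  # hidden/input dimension (hypothetical)
W = rng.standard_normal((d, d))
U = rng.standard_normal((d, d))
b = rng.standard_normal(d)
x = rng.standard_normal(d)   # current input
h = rng.standard_normal(d)   # previous hidden state

# Vanilla additive building block: phi(Wx + Uh + b)
additive = np.tanh(W @ x + U @ h + b)

# Multiplicative Integration: phi(Wx * Uh + b) -- same parameter
# count, but the two information flows now interact multiplicatively.
mi = np.tanh((W @ x) * (U @ h) + b)
```

Both expressions reuse the same W, U, and b, which is why the paper can describe MI as a drop-in structural change rather than a new model.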
Reconfigurable Sparse Matrix-Vector Multiplication on FPGAs
executing memory-intensive simulations, such as those required for sparse matrix-vector multiplication. This effect is due to the memory bottleneck that is encountered with large arrays that must be stored in dynamic RAM. An FPGA core designed for a target performance that does not unnecessarily exceed the memory imposed bottleneck can be distributed, along with multiple memory interfaces, into...
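For reference, the sparse matrix–vector kernel that such designs accelerate looks like the following in compressed sparse row (CSR) form (a generic textbook sketch, not the cited core's data layout): only the nonzeros are stored, and the row-pointer array marks where each row's nonzeros begin.

```python
import numpy as np

# A 3x3 sparse matrix in CSR form: nonzero values, their column
# indices, and row pointers delimiting each row's entries.
values  = np.array([10.0, 20.0, 30.0, 40.0])
col_idx = np.array([0, 2, 1, 2])
row_ptr = np.array([0, 2, 3, 4])
x = np.array([1.0, 2.0, 3.0])

# CSR matvec: each row i gathers x at its stored column indices.
y = np.zeros(len(row_ptr) - 1)
for i in range(len(y)):
    for k in range(row_ptr[i], row_ptr[i + 1]):
        y[i] += values[k] * x[col_idx[k]]

# Dense equivalent, for checking the CSR result.
A = np.array([[10.0,  0.0, 20.0],
              [ 0.0, 30.0,  0.0],
              [ 0.0,  0.0, 40.0]])
assert np.allclose(y, A @ x)
```

The irregular, x-dependent gathers in the inner loop are precisely the memory-bound accesses that motivate distributing the kernel across multiple memory interfaces on the FPGA.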
Journal
Journal title: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year: 2022
ISSN: 1063-8210, 1557-9999
DOI: https://doi.org/10.1109/tvlsi.2021.3135353